
Continual Deep Learning by Functional Regularisation of Memorable Past

Neural Information Processing Systems

Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past. Recent works address this with weight regularisation. Functional regularisation, although computationally expensive, is expected to perform better, but rarely does so in practice. In this paper, we fix this issue by using a new functional-regularisation approach that utilises a few memorable past examples crucial to avoid forgetting. By using a Gaussian Process formulation of deep networks, our approach enables training in weight-space while identifying both the memorable past and a functional prior. Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
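The core idea in this abstract can be illustrated with a toy sketch: store a few past inputs together with the trained model's outputs on them, then penalise any drift of the current model's outputs on those points while training on a new task. The sketch below uses a plain linear model and a simple quadratic penalty in place of the paper's Gaussian Process formulation; the memory points are picked arbitrarily rather than identified as "memorable", and all names, sizes, and the `lam` hyperparameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(w, X):
    return X @ w  # a linear model stands in for the deep network

def train(w, X, y, steps=2000, lr=0.01, X_mem=None, f_mem=None, lam=0.0):
    """Gradient descent on squared error, plus an optional functional
    regulariser anchoring the outputs on memorable past inputs."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (predict(w, X) - y) / len(X)
        if X_mem is not None:
            grad += 2 * lam * X_mem.T @ (predict(w, X_mem) - f_mem) / len(X_mem)
        w -= lr * grad
    return w

# Task 1: the target depends only on the first feature
X1 = rng.normal(size=(50, 2)); y1 = 2.0 * X1[:, 0]
w1 = train(np.zeros(2), X1, y1)

# "Memorable past": a few task-1 inputs plus the trained outputs on them
X_mem, f_mem = X1[:5], predict(w1, X1[:5])

# Task 2: the target depends only on the second feature
X2 = rng.normal(size=(50, 2)); y2 = 3.0 * X2[:, 1]
w_plain = train(w1, X2, y2)                                    # forgets task 1
w_freg  = train(w1, X2, y2, X_mem=X_mem, f_mem=f_mem, lam=10.0)

# Output drift on the memory points measures forgetting of task 1
drift_plain = np.abs(predict(w_plain, X_mem) - f_mem).max()
drift_freg  = np.abs(predict(w_freg, X_mem) - f_mem).max()
print(f"drift with regulariser: {drift_freg:.3f}  without: {drift_plain:.3f}")
```

In the paper's actual approach, the GP formulation is what identifies which past examples are memorable and supplies the functional prior used for regularisation; the squared penalty above is only a stand-in for that mechanism.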


Review for NeurIPS paper: Continual Deep Learning by Functional Regularisation of Memorable Past

Neural Information Processing Systems

All four expert reviewers were positive about this work, and the author rebuttal, along with a lively post-rebuttal discussion, further improved their opinions. I agree with the reviewers that this is a high-quality paper, and my decision is to accept. I encourage the authors to take the reviewer suggestions into account -- especially the promise to provide longer task sequences and the discussion of connections between gradient-based sample selection and the proposed memorable-sample selection approach.


Review for NeurIPS paper: Continual Deep Learning by Functional Regularisation of Memorable Past

Neural Information Processing Systems

What are the real contributions of the paper? The idea of regularising the outputs (functional regularisation) has already been explored, as the paper itself acknowledges. Combining output regularisation with memory-based methods has also been explored; see GEM [1] and A-GEM [2]. What makes this approach better or more important, e.g.



Continual Deep Learning on the Edge via Stochastic Local Competition among Subnetworks

Christophides, Theodoros, Tolias, Kyriakos, Chatzis, Sotirios

arXiv.org Artificial Intelligence

Continual learning on edge devices poses unique challenges due to stringent resource constraints. This paper introduces a novel method that leverages stochastic competition principles to promote sparsity, significantly reducing a deep network's memory footprint and computational demand. Specifically, we propose deep networks composed of blocks of units that compete locally, in a stochastic manner, to win the representation of each newly arising task. This type of network organization yields sparse task-specific representations at each layer; the sparsity pattern is obtained during training and differs across tasks. Crucially, our method sparsifies both the weights and the weight gradients, thus facilitating training on edge devices. Sparsification is driven by the winning probability of each unit within its block. During inference, the network retains only the winning unit of each block and zeroes out all weights pertaining to non-winning units for the task at hand. Our approach is thus specifically tailored for deployment on edge devices, providing an efficient and scalable solution for continual learning in resource-limited environments.
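The local-competition mechanism described above can be sketched in a few lines: group a layer's units into blocks, turn each block's activations into winning probabilities, sample a winner stochastically during training, and at inference keep only the most probable unit while zeroing the rest. Everything below (the layer sizes, the softmax parameterisation of winning probabilities) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def lwta_layer(x, W, block_size=4, train=True):
    """Linear layer whose output units are grouped into blocks of
    `block_size`; within each block one unit 'wins' and the rest are
    zeroed out, yielding a sparse task-specific representation."""
    h = x @ W                                  # (batch, units)
    b = h.reshape(h.shape[0], -1, block_size)  # (batch, blocks, block_size)
    # winning probabilities: softmax over each block's activations
    p = np.exp(b - b.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    if train:
        # stochastic winners during training
        winners = np.array([[rng.choice(block_size, p=pb) for pb in row]
                            for row in p])
    else:
        # deterministic winners at inference: keep the most probable unit
        winners = p.argmax(axis=-1)
    mask = np.zeros_like(b)
    np.put_along_axis(mask, winners[..., None], 1.0, axis=-1)
    return (b * mask).reshape(h.shape)

x = rng.normal(size=(3, 5))
W = rng.normal(size=(5, 8))            # 8 units = 2 blocks of 4
out = lwta_layer(x, W, train=False)
sparsity = (out == 0).mean()
print("activation sparsity:", sparsity)  # one live unit per block of 4
```

At inference only the mask pattern per block is needed, so the weights feeding the zeroed units can be dropped entirely, which is the source of the memory and compute savings the abstract describes.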